k-Nearest Neighbors in Uncertain Graphs

نویسندگان

  • Michalis Potamias
  • Francesco Bonchi
  • Aristides Gionis
  • George Kollios
چکیده

Complex networks, such as biological, social, and communication networks, often entail uncertainty, and thus, can be modeled as probabilistic graphs. Similar to the problem of similarity search in standard graphs, a fundamental problem for probabilistic graphs is to efficiently answer k-nearest neighbor queries (k-NN), which is the problem of computing the k closest nodes to some specific node. In this paper we introduce a framework for processing k-NN queries in probabilistic graphs. We propose novel distance functions that extend well-known graph concepts, such as shortest paths. In order to compute them in probabilistic graphs, we design algorithms based on sampling. During k-NN query processing we efficiently prune the search space using novel techniques. Our experiments indicate that our distance functions outperform previously used alternatives in identifying true neighbors in real-world biological data. We also demonstrate that our algorithms scale for graphs with tens of millions of edges.

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

A Novel Hybrid Approach for Email Spam Detection based on Scatter Search Algorithm and K-Nearest Neighbors

Because cyberspace and Internet predominate in the life of users, in addition to business opportunities and time reductions, threats like information theft, penetration into systems, etc. are included in the field of hardware and software. Security is the top priority to prevent a cyber-attack that users should initially be detecting the type of attacks because virtual environments are not moni...

متن کامل

Non-zero probability of nearest neighbor searching

Nearest Neighbor (NN) searching is a challenging problem in data management and has been widely studied in data mining, pattern recognition and computational geometry. The goal of NN searching is efficiently reporting the nearest data to a given object as a query. In most of the studies both the data and query are assumed to be precise, however, due to the real applications of NN searching, suc...

متن کامل

Mining Uncertain Graphs: An Overview

Graphs play an important role in modern world, due to their widespread use for modeling, representing and organizing linked data. Taking into consideration that most of the “killer” applications require a graph-based representation (e.g., the Web, social network management, protein-protein interaction networks), efficient query processing and analysis techniques are required, not only because t...

متن کامل

A comparative study of performance of K-nearest neighbors and support vector machines for classification of groundwater

The aim of this work is to examine the feasibilities of the support vector machines (SVMs) and K-nearest neighbor (K-NN) classifier methods for the classification of an aquifer in the Khuzestan Province, Iran. For this purpose, 17 groundwater quality variables including EC, TDS, turbidity, pH, total hardness, Ca, Mg, total alkalinity, sulfate, nitrate, nitrite, fluoride, phosphate, Fe, Mn, Cu, ...

متن کامل

Credible Nearest Neighbor Query in Uncertain Network

K-nearest-neighbor query is an important query in uncertain network, which is finding the k close nodes to a specific node. We first put forward the concept of the credible nearest neighbor query in uncertain network, and give credible k-nearest-neighbor query algorithm. Credible distance is used to describe the distance between nodes in uncertain network. Fuzzy simulation is adopted to decreas...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

عنوان ژورنال:
  • PVLDB

دوره 3  شماره 

صفحات  -

تاریخ انتشار 2010